Carlos Avendano, Sangita Tibrewala and Hynek Hermansky, "Multiresolution Channel Normalization for ASR in Reverberant
Abstract
To overcome the problems related with the long impulse responses produced by reverberation, we use a long time window (high frequency resolution) analysis during the channel normalization steps of the feature extraction process in automatic speech recognition (ASR). After normalization, a trade between frequency and time resolution is used to increase the rate at which the time information is sampled (short-time domain), yielding an appropriate domain to derive ASR features. Experiments on data with reverberation times of about 0.5 s show that the new technique achieves significant performance improvement of a speech recognizer under reverberation, with only some performance degradation on clean speech.
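The abstract does not say how the trade between frequency and time resolution is carried out. The sketch below is one possible reading, not the authors' implementation: it assumes that normalization is log-magnitude mean subtraction in a long-window STFT, that the normalized spectrum is resynthesized by overlap-add, and that the result is re-analyzed with short windows. The function names (`stft`, `istft`, `multires_normalize`) and all window lengths are hypothetical choices made for illustration.

```python
import numpy as np

def stft(x, win_len, hop):
    """Hann-windowed STFT; returns a (frames, bins) complex array."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, win_len, hop, length):
    """Weighted overlap-add inverse of stft()."""
    window = np.hanning(win_len)
    frames = np.fft.irfft(spec, n=win_len, axis=1) * window
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + win_len] += frame
        norm[i * hop:i * hop + win_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

def multires_normalize(x, fs, long_win_s=1.0, short_win_s=0.032,
                       short_hop_s=0.010):
    """Normalize with a long window, then derive short-time log spectra.

    Assumption: the window lengths are illustrative, not taken from the paper.
    """
    # Step 1: long-window analysis (high frequency resolution).
    long_win = int(long_win_s * fs)
    long_hop = long_win // 4
    spec = stft(x, long_win, long_hop)

    # Step 2: channel normalization, assumed here to be subtraction of the
    # per-bin mean of the log magnitude over time (phase is left untouched).
    log_mag = np.log(np.abs(spec) + 1e-10)
    log_mag -= log_mag.mean(axis=0, keepdims=True)
    normalized = np.exp(log_mag) * np.exp(1j * np.angle(spec))

    # Step 3: trade frequency resolution for time resolution by resynthesizing
    # the signal and re-analyzing it with short windows at a higher frame rate.
    y = istft(normalized, long_win, long_hop, len(x))
    short = stft(y, int(short_win_s * fs), int(short_hop_s * fs))
    return np.log(np.abs(short) ** 2 + 1e-10)

# Example: 4 s of noise at 8 kHz, once as-is and once with a fixed gain.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4 * 8000)
feats = multires_normalize(signal, fs=8000)
feats_gain = multires_normalize(2.0 * signal, fs=8000)
```

In this sketch a constant gain is removed by the mean subtraction, so `feats` and `feats_gain` agree up to numerical precision. A real reverberant channel is not a constant gain, so this check only illustrates the normalization step and says nothing about recognition performance.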
Similar resources
Multiresolution channel normalization for ASR in reverberant environments
To overcome the problems related with the long impulse responses produced by reverberation, we use a long time window (high frequency resolution) analysis during the channel normalization steps of the feature extraction process in automatic speech recognition (ASR). After normalization, a trade between frequency and time resolution is used to increase the rate at which the time information is ...
Towards ASR on partially corrupted speech
A new highly parallel approach to automatic recognition of speech, inspired by early Fletcher’s research on Articulation Index, and based on independent probability estimates in several sub-bands of the available speech spectrum, is presented. The approach is especially suitable for situations when part of the spectrum of speech is corrupted. In such cases, it can yield an order-of-magnitude im...
Traps - Classifiers of Temporal
TRAPS CLASSIFIERS OF TEMPORAL PATTERNS. Hynek Hermansky (1,2), Sangita Sharma (1). (1) Oregon Graduate Institute of Science and Technology, Portland, Oregon, USA. (2) International Computer Science Institute, Berkeley, California, USA. Email: hynek,[email protected] ABSTRACT: The work proposes a radically different set of features for ASR where TempoRAl Patterns of spectral energies are used in place of the ...
Multi-band and adaptation approaches to robust speech recognition
In this paper we present two approaches to deal with degradation of automatic speech recognizers due to acoustic mismatch in training and testing environments. The first approach is based on the multi-band approach to automatic speech recognition (ASR). This approach is shown to be inherently robust to frequency selective degradation. In the second approach, we present a conceptually simple unsup...
Data based filter design for RASTA-like channel normalization in ASR
RASTA processing has proven to be a successful technique for channel normalization in automatic speech recognition (ASR). We present two approaches to the design of RASTA-like filters from training data. One consists of finding the solution to a constrained optimization problem on the feature time trajectories while the other uses Linear Discriminant Analysis (LDA). Whereas LDA is often applied...